Studying CSSR Algorithm Applicability on NLP Tasks
Abstract
The CSSR algorithm learns automata that represent the patterns of a process from sequential data. This paper studies the applicability of CSSR to Noun Phrase detection. It examines the ability of the algorithm to capture the patterns behind this task and the conditions under which it performs better. An approach for using the acquired models to annotate new sentences is also outlined and, in light of all the results, the applicability of CSSR to NLP tasks is discussed.
Related resources
Applying Causal-State Splitting Reconstruction Algorithm to Natural Language Processing Tasks
This thesis focuses on the study and use of the Causal State Splitting Reconstruction (CSSR) algorithm for Natural Language Processing (NLP) tasks. CSSR is an algorithm that captures patterns from data, building automata in the form of visible Markov Models. It is based on the principles of Computational Mechanics and takes advantage of many properties of causal state theory. One of the main adva...
ME-CSSR: an Extension of CSSR using Maximum Entropy Models
This work introduces an extension of the CSSR algorithm using Maximum Entropy Models. Preliminary experiments on Named Entity Recognition with this new system are presented.
Entropy Guided Transformation Learning
This work presents Entropy Guided Transformation Learning (ETL), a new machine learning algorithm for classification tasks. It generalizes Transformation Based Learning (TBL) by automatically solving the TBL bottleneck: the construction of good template sets. We also present ETL Committee, an ensemble method that uses ETL as the base learner. The main advantage of ETL is its easy applicability ...
Picking up the pieces: Causal states in noisy data, and how to recover them
Automatic structure discovery is desirable in many Markov model applications where a good topology (states and transitions) is not known a priori. CSSR is an established pattern discovery algorithm for stationary and ergodic stochastic symbol sequences that learns a predictively optimal Markov representation consisting of so-called causal states. By means of a novel algebraic criterion, we prov...
Optimizing to Arbitrary NLP Metrics using Ensemble Selection
While there have been many successful applications of machine learning methods to tasks in NLP, learning algorithms are not typically designed to optimize NLP performance metrics. This paper evaluates an ensemble selection framework designed to optimize arbitrary metrics and automate the process of algorithm selection and parameter tuning. We report the results of experiments that instantiate t...
Journal:
- Procesamiento del Lenguaje Natural
Volume 39
Publication date 2007